WIP: The great split #353

Open · wants to merge 11 commits into main

Conversation

@sebhoss (Member) commented Nov 30, 2024

In the past, this project has had to deal with various size-related issues. In the very first version I ran with all features enabled and quickly realized that my computer did not have enough memory. I solved this by running commands like cargo check in a loop against each feature individually instead of against the entire project with all features enabled. This worked for a while; however, as more and more custom resources were added and the number of features grew with them, we ran into a crates.io limit that nowadays allows only 300 features per crate. Originally, each version in each custom resource group had its own feature; this was changed so that each group maps to one feature and each version in that group is one module.

Additionally, I contacted the crates.io team to increase the feature limit for this project so that I can continue to grow past 300 custom resource groups. Lately, we have been running into the 10 MiB per-crate size limit of crates.io. Again, I contacted the crates.io team, who were very helpful and quickly increased the size limit. However, both the feature and size limits will eventually be reached again, since the Kubernetes community is very active and creates more and more custom resources all the time.

The idea here is to come up with a solution that allows this project to grow indefinitely. The rather simple idea in this PR is to split all custom resource groups into their own crates. Each version within a group continues to be a module within that crate. Since each group typically only contains a handful of resources, I do not see any size-related problems happening in the future (although it's possible in theory…). Each crate will be versioned and released individually, and the release schedules are spread across an entire day so that we won't hammer the crates.io API with hundreds of crates at the same time.
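For illustration, the split layout would look roughly like this (directory and crate names are hypothetical, following the kcr_ prefix discussed below):

```text
custom-resources/
├── kcr_cert_manager_io/     # one crate per custom resource group
│   ├── Cargo.toml           # versioned and released individually
│   └── src/
│       ├── lib.rs
│       └── v1.rs            # one module per version within the group
└── kcr_karpenter_sh/
    └── ...
```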

Open questions:

  • Should we introduce the feature-per-version approach from the original version of this project again? I think this can help users, since groups often expose the same custom resource in several versions but users typically use only one of them. By introducing features, users can make sure that they only ever use the version they want and do not accidentally use something else.
  • Is the crates.io team ok with this? We would be releasing 400+ crates, and that number will only ever go up. Besides the high number of crates and the potential API bottleneck, I'm concerned about name squatting, since crates.io does not have a namespace/group concept. My idea here is to prefix each crate with something that does not exist yet on crates.io and hope for the best 🤷 (the current implementation uses the kcr_ prefix, which stands for kube-custom-resource). The crates.io team has asked me in the past to limit the number of releases, since the high number of features in the original crate puts too much strain on their API.

Feedback is highly welcome here. I'm pinging @w3irdrobot, @waynr, @banool, @brendanhay and @orf since you either pushed changes to this repo or otherwise interacted with it in the past, and I'm assuming you are using this project. Likewise @clux, since he created kopium and maintains kube-rs. I'm going to contact the crates.io team by mail since I don't know any of their usernames on GitHub (feel free to ping them here if you do?)

Ignore the build failures - this is WIP. If you want to see what this currently looks like, open https://github.com/metio/kube-custom-resources-rs/tree/split, since the diff view in this PR will be completely useless 😅

@sebhoss added the bug, documentation, enhancement, help wanted, and github_actions labels on Nov 30, 2024
@sebhoss self-assigned this on Nov 30, 2024
@orf commented Dec 2, 2024

This sounds awesome @sebhoss!

> Should we introduce the feature-per-version from the original version of this project again

I think that's a good idea - it seems sensible, and can reduce the work the compiler has to do as well?

On that note, this should significantly reduce compilation time for this crate. Even with only a couple of features selected, it was pretty heavyweight!

A couple of notes:

  • you can use custom-resources/* in the root workspace members list, so you don't need to maintain that big list of packages
  • we could use workspace dependencies rather than duplicating them? That is, declare the shared dependencies once in the workspace Cargo.toml:

```toml
[workspace.dependencies]
schemars = { version = "~0" }
serde = { version = "~1" }
serde_json = { version = "~1" }
k8s-openapi = { version = "~0" }
kube = { version = "~0", features = ["derive"] }
```

Then inherit them in each custom resource TOML file, one dependency at a time (Cargo has no blanket opt-in, so each entry needs workspace = true):

```toml
[dependencies]
schemars = { workspace = true }
serde = { workspace = true }
...
```

You can also do the same for package attributes:

```toml
[package]
authors.workspace = true
description.workspace = true
...
```

This would prevent the need to re-generate code when any of the workspace metadata changes?
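For completeness, these member-side keys inherit from a matching table in the workspace root; a minimal sketch with placeholder values:

```toml
# workspace root Cargo.toml
[workspace.package]
authors = ["..."]
description = "..."
```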

@orf commented Dec 2, 2024

When it comes to releasing the crates, is this really an issue? We won't be releasing all the crates every time, only when the code within changes?

If so, then this issue only applies to the first release - and we can mitigate that by just initially releasing slowly.

@orf commented Dec 2, 2024

Last comment/idea, sorry for the spam!

Rather than version/release each crate individually and per resource, we could group them by namespace instead?

I.e. rather than having ~40 *_k8s_aws crates, we could have a single k8s_aws crate with all the resources inside?

From a UX point of view, you're more likely to use CRDs in these "groups", so you probably want all of the AWS resources available to you.

This could be grouped by the last two parts of the CRD? So sqs.services.k8s.aws would become k8s.aws?

But maybe this isn't always safe, idk. At the very least it might make sense for some specific hard-coded list of groups? AWS in particular?

@sebhoss (Member, Author) commented Dec 2, 2024

Thanks for the feedback - I'll adjust the workspace-related stuff as per your comment in my next push ❤️

Wrt. releasing crates: I'm not sure whether that will be an issue, but I don't want to be a bad citizen on crates.io, so spreading releases out a bit won't hurt? My current implementation calculates a hash of the group name and then mods it by 24 and 60 to get hours and minutes.
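A minimal sketch of that calculation (assuming std's DefaultHasher; the actual implementation may differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a stable release slot (hour, minute) from a group name so
/// that publishes are spread across the day instead of happening all at once.
fn release_slot(group: &str) -> (u64, u64) {
    let mut hasher = DefaultHasher::new();
    group.hash(&mut hasher);
    let hash = hasher.finish();
    (hash % 24, hash % 60)
}

fn main() {
    let (hour, minute) = release_slot("cert-manager.io");
    println!("publish at {hour:02}:{minute:02} UTC");
}
```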

I like the idea of grouping crates even further to reduce the total number of crates! I'm not sure we can use the last two segments of their group though, e.g. https://github.com/metio/kube-custom-resources-rs/tree/split/crd-catalog/aws sometimes uses amazon.com or k8s.aws, and there are even special cases like karpenter.sh. I thought about using the GitHub/GitLab organization name to group crates (e.g. just aws for the previous example), however there are special cases like https://github.com/metio/kube-custom-resources-rs/tree/split/crd-catalog/aws-controllers-k8s, which should probably belong to the aws group as well, or https://github.com/metio/kube-custom-resources-rs/tree/split/crd-catalog/apache, which contains resources for multiple unrelated projects, and it might be weird to group them together? That said, the crd-catalog is manually maintained anyway, so manually grouping the aws and aws-controllers-k8s orgs into aws isn't really a showstopper. Likewise, we can manually split the apache org into its individual projects.

If we go down this road, how would this work with features though? Would we have one feature for each version in each group in each project? Using https://github.com/metio/kube-custom-resources-rs/tree/split/crd-catalog/aws/amazon-cloudwatch-agent-operator/cloudwatch.aws.amazon.com/v1alpha1 as an example, would we have a feature called cloudwatch_aws_amazon_com_v1alpha1 in the aws crate? Should we add cloudwatch_aws_amazon_com as well in case someone wants to enable all versions of that group at once?

@orf commented Dec 2, 2024

Hmm, on second thought, perhaps it's best not to group them. It would work for larger groups of packages, but for smaller ones it would be confusing to add a package and then enable a feature with the same name?

It would also increase update churn - you're not going to be sure your specific CRDs are updated when you update a package, whereas with a single package per CRD you can be?

@clux commented Dec 10, 2024

Hey! I think this is a good idea also. Fewer mega-crates that do everything. Everywhere I go I end up inlining kopium output into some vendor-equivalent directory, and this would help with that. Group names make sense to me, provided you don't run into some crate name length limit.

My expectation on load is that, given how you've set it up (only publishing when there are commits touching the directory), it'll likely end up being less load altogether on crates.io, as there will be less stuff released, but maybe there's some common overhead you incur. I definitely see much bigger offenders in the crates ecosystem in the build queue every now and then though (like repos releasing 100+ crates at every update unconditionally; those seem to get deprioritised in the queue).

Maybe the calendar versioning (masquerading as semver) that you inject with sed should be avoided in favor of something like cargo-release or cargo-workspaces with a blanket minor release - if it's easy to install in CI via something like taiki-e/install-action. It's not like the numbers are going to be super meaningful either way, but people do write GitHub Actions logic around auto-merging based on whether or not something is a major version, and this would make "one major every year" rather meaningless.

@sebhoss (Member, Author) commented Dec 22, 2024

Hey all - thanks for the feedback & sorry for the delay here. I was busy moving to another city and life I guess 🤷

I agree about the package/feature name duplication. It doesn't really feel right, so I've switched back to one crate per custom resource group and one feature for each version within that group. I've added a default feature as well, which enables all versions, because the crates.io UI has this quick copy&paste box that does not contain any features. My thinking here was that people should be good to go by just pasting the string they got from crates.io into their project, while still being able to fine-tune the enabled features later.
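As a sketch of what that means for users (the crate name is hypothetical, following the kcr_ prefix and the one-feature-per-version scheme):

```toml
[dependencies]
# default features enable all versions of the group; disable them and
# pick a single version module to compile only what you actually use
kcr_cert_manager_io = { version = "1", default-features = false, features = ["v1"] }
```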

The crate name length limit should be fine (at least right now): the maximum is 64 characters and we are at 43 with the kcr_ prefix.

@clux can you clarify what a version number would look like in your idea? Would we always increment the major version and set minor/patch to 0? So like going from 1.0.0 to 2.0.0 to 3.0.0 on each release? My original idea behind the calendar-based versioning was to give some meaning to the version number, since users can quickly see how old their CRDs are. However, I agree that the current versioning scheme has no meaning for auto-merging workflows that expect semver-based behavior.

I've replaced the sed call with cargo set-version, which seems to work fine. cargo-release won't work, I think, because I'm changing the version number on the fly and thus do not have a clean working tree to release from. That said, I'm not that familiar with cargo-release, so maybe I'd have to replace larger parts of the current release process to make it work? Likewise, the current setup uses a release pipeline for each crate, and therefore cargo-workspaces seems like the wrong tool to use, but maybe the release process could be changed to integrate better with cargo-workspaces as well?

About releasing: how about we do not release to crates.io at all and instead tell users to declare a git dependency (see the sketch below)? I think the advantages are:

  • zero load on crates.io
  • no crate name prefix required & no potential name clashes in the future
  • simplified release process since we just create a new git tag every week or so

Can't really think of strong downsides. Maybe it feels weird because every other crate is fetched from crates.io?
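For reference, such a git dependency would look roughly like this (the crate name and tag are hypothetical):

```toml
[dependencies]
# fetched straight from the repository, pinned to a hypothetical weekly tag
cert_manager_io = { git = "https://github.com/metio/kube-custom-resources-rs", tag = "2024-week-51" }
```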

@clux commented Dec 22, 2024

> @clux can you clarify what a version number would look like in your idea? Would we always increment the major version and set minor/patch to 0? So like going from 1.0.0 to 2.0.0 to 3.0.0 on each release? My original idea behind the calendar-based versioning was to give some meaning to the version number, since users can quickly see how old their CRDs are. However, I agree that the current versioning scheme has no meaning for auto-merging workflows that expect semver-based behavior.

I would imagine making minor releases automatically (1.X++.0), and reserving majors for something unexpected that you actually want people to look at (if it should occur). People can already see how old things are from the crates.io last-publish date or GitHub release/tag dates. To be clear, I like the intention of giving at-a-glance information, but I don't like undermining an existing system with existing conventions (semver, where majors should be looked at) to present it.

> How about we do not release to crates.io at all and instead tell users to declare a git dependency?

This does mean people have to whitelist git dependencies in tools such as cargo-deny. It's occasionally a tricky thing to get buy-in for in some security-focused organisations, because people can (and do) overwrite git history. A crates registry pin is also more permanent (e.g. it survives yanks).

@sebhoss (Member, Author) commented Dec 22, 2024

Ok thanks for clarifying!

@orf commented Dec 22, 2024

Another major downside of using git dependencies is that you’re unable to publish crates with git dependencies onto crates.io.

Regarding the versioning, I'm of two minds. On one hand, as a user, I did value the date-based (calver?) approach, and I'm also not entirely sure I'd want my CRDs updated "under my feet", so to speak.

I never experienced this personally, but it seems like it’s possible that existing CRD definitions can change in a way that causes my code to either break or refuse to compile. This would not be semver compatible.

And we're not actually versioning Rust code here, not really; we're versioning collections of very loosely versioned external "things" that don't abide by semver at all.

This leads to two possible options:

  1. Assume every change is breaking - onus is on the user to update - calver
  2. Do not assume anything about changes - onus is on the user to fix any abrupt breakages - “semver” (but not really)

I’m not sure what is best: both have advantages and disadvantages, but it’s not as clear cut as just “use semver”.

@sebhoss (Member, Author) commented Dec 22, 2024

Yeah I think there is no clear best way to version the crates in this repo.

I basically outsourced that problem to the upstream projects in my original design: if they introduce a breaking change, they have to increase the version of their CRD, which results in an entirely new Rust struct being generated while still keeping the old struct around. Users then have a choice to update their code to the latest CRD version or keep using the old one as long as they like. However, updating versions of CRDs seems to be mostly done by humans, and therefore errors are to be expected. Again, I kinda outsourced that problem to upstream projects by just ignoring it in this repo and using calver to have an ever-increasing version number and a simplified release process 😬

Additionally, there is still #289 around, which describes a breakage introduced by updating dependencies. I've been toying around with cargo-release, cargo-workspaces, and how we could version/release this repo for a bit now, and maybe there is at least something we can improve compared to the original design. My proposal is to use 1.X.Y as a version number, with X being the output of date --utc +'%Y%m%d' and Y being the output of date --utc +'%-H%M%S'. The resulting version number right now would be 1.20241222.162630. The advantage here is that we can increase the major version for technical reasons, e.g. whenever kube-rs releases a new major version, which might help with #289 in the future; the calendar-based minor version still shows how old a crate is (although it's a bit harder to read); and the clock-based patch version allows us to release multiple times a day if necessary. Users can auto-merge on minor/patch updates and still be notified when something major changes.

The release process would basically stay the same: we spread the publication of crates throughout the day to work around limits like pksunkara/cargo-workspaces#177 and do not commit versions back into git. Instead, all crates carry a 0.0.0 version number, which is incremented on the fly with cargo set-version ... before calling cargo publish. No release on GitHub and no git tag for each release.
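Put together, the per-crate publish step would look roughly like this (a sketch assembled from the commands above; the package name is hypothetical, and cargo set-version comes from cargo-edit):

```sh
# compute 1.<YYYYMMDD>.<HMMSS> at release time (UTC)
version="1.$(date --utc +'%Y%m%d').$(date --utc +'%-H%M%S')"

# bump the in-repo 0.0.0 placeholder on the fly, then publish; nothing
# is committed back to git and no tag or GitHub release is created
cargo set-version --package kcr_example_group "$version"
cargo publish --package kcr_example_group --allow-dirty
```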

serde_yml seems to be AI-maintained and runs into an error like this:

```text
Downloading https://raw.githubusercontent.com/jenkinsci/kubernetes-operator/master/config/crd/bases/jenkins.io_jenkins.yaml
thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/libyml-0.0.5/src/scanner.rs:2798:17:
String join would overflow memory bounds
stack backtrace:
   0: rust_begin_unwind
             at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/e71f9a9a98b0faf423844bf0ba7438f29dc27d58/library/core/src/panicking.rs:76:14
   2: libyml::scanner::yaml_parser_scan_flow_scalar
   3: libyml::scanner::yaml_parser_fetch_flow_scalar
   4: libyml::scanner::yaml_parser_fetch_next_token
   5: libyml::scanner::yaml_parser_fetch_more_tokens
   6: libyml::parser::peek_token
   7: libyml::parser::yaml_parser_parse_block_mapping_value
   8: libyml::parser::yaml_parser_state_machine
   9: libyml::parser::yaml_parser_parse
  10: serde_yml::libyml::parser::Parser::parse_next_event
  11: serde_yml::loader::Loader::next_document
  12: <serde_yml::de::Deserializer as core::iter::traits::iterator::Iterator>::next
  13: <serde_yml::de::Deserializer as core::iter::traits::iterator::Iterator>::next
  14: crd_v1_fetcher::parse_crds
  15: crd_v1_fetcher::main
  16: core::ops::function::FnOnce::call_once
```
